Analyses

Histogram

AlgebraOfGraphics.histogramFunction
histogram(; bins=automatic, datalimits=automatic, closed=:left, normalization=:none)

Compute a histogram. bins can be an Int to create that number of equal-width bins over the range of values. In that case, the range covered by the bins is defined by datalimits (defaults to the extrema of the data). Alternatively, bins can be a sorted iterable of bin edges. closed determines whether the the intervals are closed to the left or to the right. The histogram can be normalized by setting normalization. Possible values are:

  • :pdf: Normalize by sum of weights and bin sizes. Resulting histogram has norm 1 and represents a PDF.
  • :density: Normalize by bin sizes only. Resulting histogram represents count density of input and does not have norm 1.
  • :probability: Normalize by sum of weights only. Resulting histogram represents the fraction of probability mass for each bin and does not have norm 1.
  • :none: Do not normalize.

Weighted data is supported via the keyword weights.

Note

Normalizations are computed withing groups. For example, in the case of normalization=:pdf, sum of weights within each group will be equal to 1.

source
using AlgebraOfGraphics, CairoMakie
set_aog_theme!()

df = (x=randn(1000), y=randn(1000), z=rand(["a", "b", "c"], 1000))
specs = data(df) * mapping(:x, layout=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
specs = data(df) * mapping(:x, dodge=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
specs = data(df) * mapping(:x, stack=:z, color=:z) * histogram(bins=range(-2, 2, length=15))
draw(specs)
data(df) * mapping(:x, :y, layout=:z) * histogram(bins=15) |> draw

Density

AlgebraOfGraphics.densityFunction
density(; datalimits=automatic, kernel=automatic, bandwidth=automatic, npoints=200)

Fit a kernel density estimation of data. Here, datalimits specifies the range for which the density should be calculated, and kernel and bandwidth are forwarded to KernelDensity.kde. npoints is the number of points used by Makie to draw the line

source
df = (x=randn(5000), y=randn(5000), z=rand(["a", "b", "c", "d"], 5000))
datalimits = ((-2.5, 2.5),)
xz = data(df) * mapping(:x, layout=:z) * AlgebraOfGraphics.density(; datalimits)
axis = (; ylabel="")
draw(xz; axis)
data(df) * mapping(:x, :y, layout=:z) * AlgebraOfGraphics.density(npoints=50) |> draw
specs = data(df) * mapping(:x, :y, layout=:z) *
    AlgebraOfGraphics.density(npoints=50) * visual(Surface)

draw(specs, axis=(type=Axis3, zticks=0:0.1:0.2, limits=(nothing, nothing, (0, 0.2))))

Frequency

df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, layout=:z) * frequency()
draw(specs)
specs = data(df) * mapping(:x, layout=:z, color=:y, stack=:y) * frequency()
draw(specs)
specs = data(df) * mapping(:x, :y, layout=:z) * frequency()
draw(specs)

Expectation

df = (x=rand(["a", "b", "c"], 100), y=rand(["a", "b", "c"], 100), z=rand(100), c=rand(["a", "b", "c"], 100))
specs = data(df) * mapping(:x, :z, layout=:c) * expectation()
draw(specs)
specs = data(df) * mapping(:x, :z, layout=:c, color=:y, dodge=:y) * expectation()
draw(specs)
specs = data(df) * mapping(:x, :y, :z, layout=:c) * expectation()
draw(specs)

Linear

AlgebraOfGraphics.linearFunction
linear(; interval=automatic, dropcollinear=false, npoints=200)

Compute a linear fit of y ~ 1 + x. An optional named mapping weights determines the weights. Use interval to specify what type of interval the shaded band should represent. Valid values of interval are :confidence delimiting the uncertainty of the predicted relationship, and :prediction delimiting estimated bounds for new data points. By default, this analysis errors on singular (collinear) data. To avoid that, it is possible to set dropcollinear=true. npoints is the number of points used by Makie to draw the line

source
x = 1:0.05:10
a = rand(1:7, length(x))
y = 1.2 .* x .+ a .+ 0.5 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (linear() + visual(Scatter))
draw(specs)

Smoothing

AlgebraOfGraphics.smoothFunction
smooth(; span=0.75, degree=2, npoints=200)

Fit a loess model. span is the degree of smoothing, typically in [0,1]. Smaller values result in smaller local context in fitting. degree is the polynomial degree used in the loess model. npoints is the number of points used by Makie to draw the line

source
x = 1:0.05:10
a = rand(1:7, length(x))
y = sin.(x) .+ a .+ 0.1 .* randn.()
df = (; x, y, a)
specs = data(df) * mapping(:x, :y, color=:a => nonnumeric) * (smooth() + visual(Scatter))
draw(specs)

This page was generated using Literate.jl.